perf + seo: 15x faster lookup SDK, sync all AI discovery surfaces to the real 11-tool MCP registry by dbwls99706 · Pull Request #127 · dbwls99706/deadends.dev

dbwls99706 · 2026-07-02T15:15:22Z

Summary

Project-wide review with two focused improvements: a lookup SDK performance fix and a full sync of every AI-agent discovery surface so agents are steered to the real capabilities of this project.

1. Lookup SDK: pre-compile canon regexes (~15x faster)

generator/lookup.py recompiled all 2,339 canon regexes on every lookup_all() call — the stdlib re module caches only 512 patterns internally, so the cache thrashed on each pass (~260ms per lookup). The MCP server (mcp/server.py) and Vercel endpoint (api/mcp.py) already pre-compile; the SDK was the only path that didn't.

Compiled patterns are now cached index-aligned with the canon list and invalidated by list identity (tests that swap _CANONS_CACHE still rebuild correctly).
~260ms → ~17ms per lookup after warm-up; identical match results. batch_lookup of 10 messages: ~2.6s → ~0.4s.
Invalid regexes warn once at compile time instead of on every call.

2. AI discovery surfaces: fix 8/10-tool drift, advertise country knowledge

The MCP server exposes 11 tools, but most discovery files advertised 8–10, and none surfaced the country-scoped dataset (the project's key differentiator):

New MCP_TOOL_NAMES constant in generator/build_site.py is the single source of truth for: llms.txt, .well-known/ai-plugin.json, .well-known/agent.json (A2A), .well-known/mcp.json, .well-known/mcp/server-card.json, site CLAUDE.md, .cursorrules/.windsurfrules/.clinerules, site AGENTS.md, and the homepage ai-summary block.
agent.json: adds list-errors-by-country, get-country-summary, report-outcome skills (8 → 11).
server-card.json: adds report_outcome; version synced to 1.6.0 (matches mcp/server.py serverInfo).
Agent config files now advertise country endpoints (/api/v1/countries.json, /api/v1/country/{cc}.json) and per-domain context slices (/llms-full-{domain}.txt) so agents can pull bounded, targeted context.
Homepage ai-summary: adds COUNTRY_INDEX/COUNTRY_PATTERN lines; tool list rendered from the constant.
Repo docs: root AGENTS.md and CLAUDE.md tool lists corrected to 11.
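
A rough sketch of the single-source-of-truth idea, assuming each discovery surface is rendered by a function that takes the tool list as input. The render functions, the output text, and the JSON shape are hypothetical, and only the three tool names mentioned in this PR are listed; the real constant has 11 entries.

```python
import json

# Only these three names appear in this PR; the real registry has 11 tools.
MCP_TOOL_NAMES = (
    "list_errors_by_country",
    "get_country_summary",
    "report_outcome",
)


def render_llms_txt_tools(tool_names=MCP_TOOL_NAMES):
    """Render a plain-text tool section (hypothetical layout)."""
    lines = [f"MCP tools ({len(tool_names)}):"]
    lines += [f"- {name}" for name in tool_names]
    return "\n".join(lines)


def render_agent_skills(tool_names=MCP_TOOL_NAMES):
    """Render skills as JSON with hyphenated ids (hypothetical shape)."""
    skills = [{"id": name.replace("_", "-")} for name in tool_names]
    return json.dumps({"skills": skills}, indent=2)


if __name__ == "__main__":
    print(render_llms_txt_tools())
    print(render_agent_skills())
```

Since every renderer reads the same tuple, adding a tool changes the count and the list on all surfaces at once, which is the property the hardcoded lists lacked.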

3. Housekeeping

Fixed ruff E741 in scripts/collect_github_signals.py — ruff check . is now fully clean.

Test plan

ruff check . — clean
python -m pytest tests/ — 300 passed (296 existing + 4 new)
- New: regex-cache reuse / invalidation-on-swap / invalid-regex-skip regression tests
- New: MCP_TOOL_NAMES == mcp/server.py TOOLS drift-prevention test
python -m generator.validate --data-only — passed (2,339 canons)
Full site rebuild + python -m generator.validate --site-only — passed
Verified generated artifacts: all 5 agent config files, 4 .well-known manifests, llms.txt, and homepage now advertise 11 tools + country endpoints
CLI smoke test: python -m generator.lookup "ModuleNotFoundError..." returns identical matches
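
The drift-prevention test listed above could look roughly like the sketch below. It assumes `TOOLS` is a list of dicts with a `name` key, which this PR does not confirm, and it uses local stand-ins in place of importing generator/build_site.py and mcp/server.py. `registry_drift` is a hypothetical helper.

```python
# Stand-ins for generator.build_site.MCP_TOOL_NAMES and mcp.server.TOOLS.
# Only three of the 11 tools are shown; the dict shape is an assumption.
MCP_TOOL_NAMES = (
    "list_errors_by_country",
    "get_country_summary",
    "report_outcome",
)
TOOLS = [
    {"name": "list_errors_by_country"},
    {"name": "get_country_summary"},
    {"name": "report_outcome"},
]


def registry_drift(advertised, tools):
    """Return (missing, extra): names registered but not advertised, and vice versa."""
    registered = {tool["name"] for tool in tools}
    advertised = set(advertised)
    return sorted(registered - advertised), sorted(advertised - registered)


def test_discovery_surfaces_match_registry():
    missing, extra = registry_drift(MCP_TOOL_NAMES, TOOLS)
    assert not missing, f"tools not advertised: {missing}"
    assert not extra, f"advertised but not registered: {extra}"
```

Reporting the two differences separately makes a failure say which side is stale, which a bare equality assertion would not.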

🤖 Generated with Claude Code

https://claude.ai/code/session_01BgnR294a753jA4h4yDbJT9


- generator/lookup.py: cache compiled regexes alongside the canon list, invalidated by list identity so tests that swap _CANONS_CACHE still rebuild correctly. With 2339 canons the stdlib re cache (512 entries) thrashed, recompiling every pattern on every lookup_all() call: ~260ms -> ~17ms per lookup after warm-up. Invalid regexes now warn once at compile time instead of on every call.
- tests: add regression tests for cache reuse, invalidation on canon swap, and invalid-regex skipping.
- CLAUDE.md: MCP server docs said 8 read-only tools; server exposes 11 (list_errors_by_country, get_country_summary, report_outcome), and report_outcome writes to data/outcomes/.
- scripts/collect_github_signals.py: fix ruff E741 (ambiguous name 'l').

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BgnR294a753jA4h4yDbJT9

AI agents were being steered by stale metadata: most discovery files advertised only 8-10 MCP tools while the server exposes 11, and none of them surfaced the country-scoped knowledge (the dataset's key differentiator vs generic LLM answers).

- Add MCP_TOOL_NAMES to generator/build_site.py as the single source of truth for every emitted discovery surface: llms.txt, ai-plugin.json, agent.json (A2A), mcp.json, server-card.json, site CLAUDE.md, .cursorrules/.windsurfrules/.clinerules, site AGENTS.md, and the homepage ai-summary block.
- agent.json: add list-errors-by-country, get-country-summary, and report-outcome skills (8 -> 11).
- server-card.json: add report_outcome, sync version to 1.6.0 (matches mcp/server.py serverInfo).
- Site config files now advertise country endpoints (/api/v1/countries.json, /api/v1/country/{cc}.json) and per-domain context slices (/llms-full-{domain}.txt) so agents can pull bounded, targeted context instead of skipping the 2MB full dump.
- Homepage ai-summary: add COUNTRY_INDEX/COUNTRY_PATTERN lines and render the tool list from mcp_tools instead of a hardcoded 8.
- Root AGENTS.md: add the two missing country tools.
- tests: assert MCP_TOOL_NAMES == mcp/server.py TOOLS so the discovery surfaces can never drift from the real registry again.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BgnR294a753jA4h4yDbJT9

vercel · 2026-07-02T15:15:28Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
deadends-dev	Ready	Preview, Comment	Jul 2, 2026 3:15pm

dbwls99706 and others added 2 commits July 2, 2026 15:00

dbwls99706 merged commit db0df8d into main Jul 2, 2026
4 checks passed

dbwls99706 merged 2 commits into main from claude/project-review-analysis-jghi99
